In the paper we give values of minimal detectable power for detectors. These values are based on some statistical analysis of the measurements. We would like to show confidence intervales for these values to give more confidence for the reader that enough repetitions were performed to give statistically meaningful results.

$P_{in-min}$ was determined in three steps. Each of these steps introduces some uncertainty.

  1. $\gamma_0$ was determined from PDF of $N_p$ measurements of $\gamma$ with no signal (and set $P_{fa}$. Here, uncertainty in PDF due to finite $N_p$ results in uncertainty in $\gamma_0$ estimate.
  2. $P_d$ was determined for each $P_{in}$ based on $N_p$ trials using maximum likelyhood estimate. Finite $N_p$ means uncertainty in the MLE.
  3. Final $P_{in-min}$ value was determined from estimated $P_d(P_{in})$ using linear interpolation. Interpolation introduces some error as well.

How to estimate uncertainties in each of these estimates?

  1. TBD

    https://en.wikipedia.org/wiki/Empirical_distribution_function

    https://stats.stackexchange.com/questions/15891/what-is-the-proper-way-to-estimate-the-cdf-for-a-distribution-from-samples-taken

  2. Confidence interval for MLE can be calculated from the beta distribution.

  3. Bounds of the interpolation error can be given based on the assumption that $P_d(P_{in})$ is a monotonic funcion.

How to account for all of these uncertainties and derive the uncertainty of the final $P_{in-min}$ estimate? TBD

A simple solution would be to ignore 1 and 3 and only estimate 2. Can we check that that is a good approximation of the uncertainty?

Here we do a Monte Carlo simulation and compare its results with estimate 2.

In the simulation we only account for uncertainties 1 and 2. Interpolation is not accounted for.

Setting up the test

We will perform the simulated measurement of $P_d$ for a single $P_{in}$. We are most interested in the uncertainty for $P_d$ that is near the set $P_{d-min}$. Hence we must first find the $P_{in}$ to run the simulation at.

To do this, we first calculate probability of detection for a range of input powers, in the same way we do in the Evaluation of ... notebook.


In [19]:
Pconf = 1-1e-6
Pfa = 0.1

In [20]:
from scipy.special import betaincinv

def get_edf(x):    
    xs = array(x)
    xs.sort()
    N = float(len(xs))
    P = arange(N)/N
    
    return xs, P

def get_gamma0(gammaN):
    gammaN, edf = get_edf(gammaN)
    gamma0 = interp(1. - Pfa, edf, gammaN)
    
    return gamma0

def get_Pd(gamma, gamma0):
    N1 = sum(gamma >= gamma0)
    N0 = sum(gamma < gamma0)

    u = betaincinv(N1+1, N0+1, (1. - Pconf)/2.)
    v = betaincinv(N1+1, N0+1, (1. + Pconf)/2.)
    
    mle = float(N1)/(N1+N0)
    
    return u, v, mle

In [21]:
#path_dat = "../simout-20170220-151223/dat"
path_dat = "../simout-20170221-115814/dat"
det = "ed"
#det = "mac_l25"
#det = "cav_l25"

gamma = loadtxt(path_dat + "/sim_micsoft_gaussian_noise_m100dbm_0000_fs1mhz_Ns25ks_%s_off.dat" % (det,))
gamma0 = get_gamma0(gamma)

Pin = arange(-140, -100, 1)
Pd = empty(shape=(Pin.shape[0],3))

for i in range(len(Pin)):
    gamma = loadtxt(path_dat + "/sim_micsoft_gaussian_noise_m100dbm_0000_fs1mhz_Ns25ks_%s_m%d_0dbm.dat" % (det, -Pin[i]))
    Pd[i,:] = get_Pd(gamma, gamma0)

fill_between(Pin, Pd[:,0], Pd[:,1], edgecolor='none', facecolor='lightgray')
plot(Pin, Pd[:,2], 'o-')
xlabel("Pin - input power")
ylabel("Pd - probability of detection")
grid()


We estimate the input power at which we get approximately the desired probability of detection. We'll do a simulation run with this input power.


In [22]:
Pdmin = .9

interp(Pdmin, Pd[:,2], Pin)


Out[22]:
-116.23140495867769

The actual test

This is the actual test. We repeat the $P_d$ estimation $N$ times.


In [23]:
#path_dat = "../simout-20170220-153817/dat"
#path_dat = "../simout-20170220-154821/dat"

#path_dat = "../simout-20170220-162521/dat"
#path_dat = "../simout-20170221-093117/dat"

path_dat = "../simout-20170221-122723/dat"
det = "ed"; p = "116"
#det = "mac_l25"; p = "118"
#det = "cav_l25"; p = "119"

In [24]:
N = 1000

gamma0 = empty(N)
Pd = empty(shape=(N,3))
Pd0 = empty(shape=(N,3))

for n in range(N):
    prefix = path_dat + "/sim_micsoft_gaussian_noise_m100dbm_%04d_fs1mhz_Ns25ks_%s_" % (n, det)

    gamma = loadtxt(prefix + "off.dat")
    gamma0[n] = get_gamma0(gamma)
    
    gamma = loadtxt(prefix + "m" + p + "_0dbm.dat")   
    Pd[n] = get_Pd(gamma, gamma0[n])
    Pd0[n] = get_Pd(gamma, gamma0[0])

In [25]:
hist(gamma0, bins=20);
xlabel("gamma0 - threshold")
ylabel("count")
grid()


This is the histogram of the $\gamma_0$ estimates - the uncertainty 1 on the list above.


In [26]:
h = hist(Pd[:,2], bins=20, color='b', alpha=.5, label='unc. 1+2')
hist(Pd0[:,2], bins=h[1], color='r', alpha=.5, label='unc. 2 only')

i = 0

bar([Pd[i,0]], [amax(h[0])], width=Pd[i,1]-Pd[i,0], facecolor='lightgray', edgecolor='none', zorder=-1)
plot([Pd[i,2], Pd[i,2]], [0, amax(h[0])], 'k-', lw=2)

xlabel("Pd - probability of detection")
ylabel("count")
legend()
grid()


This is the histogram of the uncerties in $P_d$.

Black line and the shaded area show a single MLE and the calculated confidence interval. This is an example of how the result would be represented in the paper if only uncertainty 2 would be estimated using the beta distribution.

Blue histogram shows distribution of estimates that fully account for uncertainties 1 and 2. This is the expected uncertainty of the actual measurements. It should mostly fall into the shaded area if that representation is OK to use.

Red histogram shows distribution of $P_d$ estimates when only uncertainty 2 is present. This is what the shadowed area actually represents.

We see that unc. 1+2 look similar and probably there is no big error if uncertainty is represented only using unc. 2 confidence interval.


In [27]:
plot(Pd[:100,2], 'o-')


Out[27]:
[<matplotlib.lines.Line2D at 0x7fe06812a750>]

We check here that we don't get rows of repeated values, due to identical random seeds in multiprocessing workers.

Uncertainty 1


In [28]:
i = 0
path = path_dat + "/sim_micsoft_gaussian_noise_m100dbm_%04d_fs1mhz_Ns25ks_%s_off.dat" % (i, det)
gamma = loadtxt(path)

In [51]:
from scipy.special import erfinv

gamma, edf = get_edf(gamma)
n = len(gamma)

# variance of the EDF
# From "Asymptotic properties of empirical distribution function"
sigma = sqrt(edf*(1 - edf)/n)

# confidence interval calculated from variance
e = sqrt(2)*sigma*erfinv(Pconf)

gamma0mu = interp(1. - Pfa, edf, gamma)
gamma0u = interp(1. - Pfa, edf+e, gamma)
gamma0v = interp(1. - Pfa, edf-e, gamma)

plot([gamma0mu, gamma0mu], [0, 1], 'k--')
plot([gamma0u, gamma0v], [1-Pfa, 1-Pfa], 'k-')

title("$n = %d$" % n)
fill_between(gamma, edf+e, edf-e, edgecolor='none', facecolor='lightgray')
plot(gamma, edf)
ylabel("EDF")
xlabel("$\gamma$")
grid()



In [56]:
h = hist(gamma0, bins=20, alpha=.5);

bar([gamma0u], [amax(h[0])], width=gamma0v-gamma0u, facecolor='lightgray', edgecolor='none', zorder=-1)
plot([gamma0mu, gamma0mu], [0, amax(h[0])], 'k-', lw=2)

xlabel("gamma0 - threshold")
ylabel("count")
grid()



In [55]:


In [ ]: